Data, Democracy & Development (DDD) summer field practice course
At Azim Premji University
19 May, 2025
I am Ayush.
I am a researcher working at the intersection of data, development and economics.
I am an RStudio (Posit) certified tidyverse instructor.
I am a researcher at the Oxford Poverty and Human Development Initiative (OPHI) at the University of Oxford.
Do you have a SurveyCTO account? If not, register for a free account here.
Please download the SurveyCTO Collect app on your phone
Get a working understanding of how components of survey tech work.
Design and program a survey questionnaire to make it available for field use.
Extract the collected data for analyses.
Build an intuition on how to ask questions.
Advanced methods of communicating with servers using API.
Data encryption for sensitive data.
Building data pipelines that extract, clean and present analyses.
Sampling theory, and analyses.
and how it works
But why should I know this stupid tech stuff?
Say, there is only one single question. Nothing else is of interest to you. Everyone and anyone can be asked this question.
What would the form definition then look like?
First (name,label), you need the exact phrasing of the question.
Second (type), what is the type of answer that you expect and accept? (a number, a date, an image?)
Third (constraint), any reasonable constraints on the answers? (an age range, a date range ?)
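As a rough sketch of the three ingredients above, the snippet below writes a one-question survey sheet as CSV text. The field name, the wording of the question, and the constraint limits are made up for illustration; a real form definition also needs the choices and settings sheets.

```python
import csv
import io

# Column headers of a minimal survey sheet.
COLUMNS = ["type", "name", "label", "constraint"]

# One row is one question. The name, label and limits are hypothetical.
question = {
    "type": "integer",                       # what kind of answer we accept
    "name": "biryani_per_week",              # short identifier for the field
    "label": "How many times did you eat biryani in the last week?",
    "constraint": ". >= 0 and . <= 21",      # reasonable range for the answer
}


def survey_sheet_csv(rows):
    """Return the survey sheet as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sheet = survey_sheet_csv([question])
print(sheet)
```

The point is only that one question becomes one row, with one column per decision you made about it.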
Create a copy of the file from the previous example.
Rename the file.
Open the newly renamed file. Remove all rows except the title rows from the survey sheet.
As discussed in the previous slide, add a single question in the survey sheet. Populate all necessary columns.
Remove all rows except the title row from the choices sheet.
Add row in choices sheet if needed.
Change the form title, and form ID in the settings sheet.
Upload the form on your server and test.
Before we move ahead..
Why?
Lets us calculate on the fly, apply conditional flows to questions, and validate answers.
There are two important ideas from the syntax point of view
First, referring to the value of a specific field.
Second, referring to the value of the current field.
We shall look at how these are used.
In the trivial-biryani-survey I use ${enrolled} = 1 to ensure that questions are asked only to respondents who report being enrolled in college.
I also use .>16 and .<130 to validate age values. Do you think my range is a reasonable one?
Can you find other instances in the biryani survey where I use such expressions?
How is the $ operator different from the . operator?
See this page to browse through applying other logical and mathematical operations.
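One way to think about the two kinds of reference is the Python analogy below. It is not how SurveyCTO evaluates expressions internally; it only mirrors the two expressions quoted from the biryani survey. The `answers` dictionary is a hypothetical stand-in for a form in progress.

```python
# Answers recorded so far in a hypothetical form.
answers = {"enrolled": 1}


def enrolled_is_one(answers):
    """Analogue of ${enrolled} = 1: look up ANOTHER field by its name."""
    return answers.get("enrolled") == 1


def age_in_range(current_value):
    """Analogue of .>16 and .<130: test the value of the CURRENT field."""
    return 16 < current_value < 130


print(enrolled_is_one(answers))   # depends on an earlier answer
print(age_in_range(19))           # depends only on what is being typed now
```

The first function needs the whole set of answers; the second only needs the value being entered.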
The need to restrict answers to a reasonable set of options or numbers.
How many people live in your house: “56”
Constraints are implemented in the constraint column of the form definition.
A constraint can be accompanied by a message for the enumerator, which goes in the constraint message column of the form definition.
Can you find the constraint implemented in the trivial-biryani-survey for number of times a respondent consumes biryani in a week?
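The pairing of a constraint with its message can be sketched in Python as follows. The limits of 1 and 30 and the wording of the message are assumptions chosen for the household-size example, not values from any real form.

```python
def check_household_size(value, low=1, high=30):
    """Return (is_valid, message); the message is empty when the answer passes.

    The default limits are hypothetical and should come from what is
    plausible in the survey area.
    """
    if low <= value <= high:
        return True, ""
    return False, f"Household size must be between {low} and {high}. Please re-check."


print(check_household_size(5))
print(check_household_size(56))
```

An answer of 56 is rejected and the enumerator sees why, instead of the typo slipping into the data.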
It is usually of interest to see how certain people answer a question of our interest.
We can use expressions to change the flow of the survey questions so that we ask the right question to the right people.
Expressions are provided in the relevance column for a question or a group of questions.
Continuing only on consent is the most common, most important, and easiest relevance condition to implement; every decent survey has it.
See the relevance column of the trivial-biryani-survey for an example.
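The effect of relevance on the flow of a survey can be sketched as a list of questions, each with an optional condition on earlier answers. This is only an analogy; the field names `consent` and `college_name` are hypothetical, while `enrolled` follows the biryani survey.

```python
# Each question has a name and an optional relevance check on earlier answers.
QUESTIONS = [
    {"name": "consent", "relevance": None},
    {"name": "enrolled", "relevance": lambda a: a.get("consent") == 1},
    {
        "name": "college_name",
        "relevance": lambda a: a.get("consent") == 1 and a.get("enrolled") == 1,
    },
]


def questions_to_ask(answers):
    """Return the names of the questions that are relevant given the answers."""
    return [
        q["name"]
        for q in QUESTIONS
        if q["relevance"] is None or q["relevance"](answers)
    ]


print(questions_to_ask({"consent": 0}))
print(questions_to_ask({"consent": 1, "enrolled": 1}))
```

Without consent, nothing after the consent question is shown.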
Create a form definition for a door-to-door household interview.
You are broadly interested in the access to banking and spending autonomy of stay-at-home married women.
You want to capture details of education, age, marital status, and age at marriage of the target group and their spouse.
You want information on holding a bank account, debit cards, credit cards, and UPI for the target group and their spouse.
For the target group, you want to record who opened their account and whether they opened an (additional or new) account after marriage.
Does the target group have any sources of income or savings independent of their spouse?
How often does the target group spend on indulgence expenses only for themselves (not for their kids' education, not for HH expenses, etc.)? Ask for reasons if they don't.
Begin by making a flow chart of questions, phrase the questions, use constraints and relevance as required.
Upload the form on SurveyCTO and test it, does it work as expected?
Organization
Easy Navigation
Less Duplication
Less copy paste errors
| type | name | label | relevance |
|---|---|---|---|
| begin group | gname | Group1 | ${child}=1 |
| ..fields.. | f1 | what… | |
| ..fields.. | f2 | how… | |
| end group | | | |
| type | name | label | relevance | repeat_count |
|---|---|---|---|---|
| begin repeat | gname | Group1 | ${child}=1 | ${num_child} |
| ..fields.. | f1 | what… | | |
| ..fields.. | f2 | how… | | |
| end repeat | | | | |
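A repeat group behaves roughly like a loop: the fields inside it are asked once per repetition, and only if the relevance holds. The sketch below lists the questions that would be asked; the `f1_1`, `f2_1` style labels are an illustrative naming choice for this example.

```python
def repeat_instances(child, num_child, fields=("f1", "f2")):
    """List the questions asked by a repeat group.

    child     : value of the field used in the relevance (${child}=1)
    num_child : value of the field used in repeat_count (${num_child})
    """
    if child != 1:
        # Relevance fails, so the whole group is skipped.
        return []
    return [
        f"{field}_{i}"
        for i in range(1, num_child + 1)
        for field in fields
    ]


print(repeat_instances(child=1, num_child=2))
print(repeat_instances(child=0, num_child=3))
```

Writing f1 and f2 once and repeating them is what saves the duplication and the copy-paste errors.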
| type | name | calculation | relevance |
|---|---|---|---|
| calculate | num_rand | once(random()) | |
| integer | treat | ${num_rand}<=0.5 | |
| integer | control | ${num_rand}>0.5 | |
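The logic of the table above can be sketched in Python: draw one uniform random number, keep it, and derive both indicators from it. This is an analogy using Python's `random` module, not SurveyCTO's own generator; the seed is only there to make the example repeatable.

```python
import random


def assign_arm(rng):
    """Draw a single number in [0, 1) and derive treat and control from it."""
    num_rand = rng.random()            # drawn once, like once(random())
    treat = int(num_rand <= 0.5)       # 1 if the draw falls in the lower half
    control = int(num_rand > 0.5)      # 1 otherwise
    return num_rand, treat, control


rng = random.Random(2025)              # hypothetical seed, for repeatability
for _ in range(3):
    print(assign_arm(rng))
```

Because both indicators come from the same draw, every respondent lands in exactly one of the two arms.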
Allows us to use data, collected in previous surveys or otherwise, in survey forms.
Attach one or multiple supporting data files as csv or as spreadsheets or server data.
The attached data must have the first row as unique column names for the data.
There must exist a column to uniquely identify each row.
Every column from the data that is required in the survey form gets its own calculate field in the form definition.
| ind_id | score_bir |
|---|---|
| 192 | 3 |
| 289 | 6 |
| 104 | 2 |
| 131 | 5 |
| 242 | 4 |
Baseline.csv
You are interested in figuring out the right condiment for the new biryani that has been created.
As a baseline, you have found a group of individuals who have scored the new biryani, consumed on its own.
It is known that people who score it below 6 never order the product.
So, in your revisit, you want to meet the people who scored it below 6 and offer each of them one of two condiments at random (salan or raita).
Then ask them to score again. Finally, see if a condiment is able to get the average scores above 6.
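The revisit plan can be sketched in a few lines of Python using the baseline scores from the table. The random seed is an assumption made for repeatability, and the follow-up scores are not simulated because they have to come from the field.

```python
import random

# Baseline scores keyed by ind_id, copied from the baseline table.
baseline = {192: 3, 289: 6, 104: 2, 131: 5, 242: 4}

# Revisit only the people who scored the biryani below 6.
revisit = [ind_id for ind_id, score in baseline.items() if score < 6]

# Offer each revisited person one of the two condiments at random.
rng = random.Random(2025)  # hypothetical seed
condiment = {ind_id: rng.choice(["salan", "raita"]) for ind_id in revisit}


def average(scores):
    """Mean of a list of scores, or None for an empty list."""
    return sum(scores) / len(scores) if scores else None


print(revisit)
print(condiment)
print(average([baseline[i] for i in revisit]))
```

After the revisit, the same `average` helper could be applied to the new scores of each condiment group and compared against 6.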
pulldata() is the function that allows us to get additional data into the survey form.
pulldata('name of the file', 'column of interest', 'uid column in data', uid value to match)
pulldata('Baseline', 'score_bir', 'ind_id', ${id}) is provided in the calculation column of the form.
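What this lookup does can be imitated in Python. The function below is an analogy written for this example, not SurveyCTO code: it reads the baseline data from an inline string rather than an attached file, and returning an empty string when nothing matches is an assumption of this sketch.

```python
import csv
import io

# Contents of the baseline data, copied from the table.
BASELINE_CSV = """ind_id,score_bir
192,3
289,6
104,2
131,5
242,4
"""


def pulldata(csv_text, column, key_column, key_value):
    """Return `column` from the first row whose `key_column` equals `key_value`."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[key_column] == str(key_value):
            return row[column]
    return ""  # assumption of this sketch: blank when no row matches


print(pulldata(BASELINE_CSV, "score_bir", "ind_id", 192))
```

This is why the data needs a column that uniquely identifies each row: with duplicate ids, the lookup would be ambiguous.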
It is unreasonable to expect the respondent or the enumerator to recall the uid accurately every time.
We need the ability to search through the additional data to identify the right respondents.
The enumerator should be able to search through the possible respondents of a given area (district, village, ward, etc.).
| district_name | district_id | village | village_id | hhid |
|---|---|---|---|---|
| D1 | 1 | v1 | 1 | 100 |
| D1 | 1 | v2 | 2 | 101 |
| D1 | 1 | v3 | 3 | 102 |
| D1 | 1 | v4 | 4 | 103 |
| D1 | 1 | v5 | 5 | 104 |
| D2 | 2 | v6 | 6 | 105 |
| D2 | 2 | v7 | 7 | 106 |
| D2 | 2 | v8 | 8 | 107 |
| D2 | 2 | v9 | 9 | 108 |
| D2 | 2 | v10 | 10 | 109 |
listing.csv
| type | name | label | appearance |
|---|---|---|---|
| select_one district | choose_dist | enumerator choose dist | search('listing') |
survey
| list_name | value | label |
|---|---|---|
| district | district_id | district_name |
choices
| type | name | label | appearance |
|---|---|---|---|
| select_one village | choose_village | enumerator choose village | search('listing', 'matches', 'district_id', ${choose_dist}) |
survey
| list_name | value | label |
|---|---|---|
| village | village_id | village |
choices
| type | name | label | appearance |
|---|---|---|---|
| select_one HH | choose_HH | enumerator choose HH | search('listing', 'matches', 'village_id', ${choose_village}) |
survey
| list_name | value | label |
|---|---|---|
| HH | hhid | hhid |
choices
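The three-step narrowing from district to village to household can be imitated in Python on the listing data. The `search` function here is a simplified analogy written for this example: it only supports an exact match on one column, and dropping duplicate choices is a design choice of the sketch.

```python
import csv
import io

# Contents of listing.csv, copied from the table.
LISTING_CSV = """district_name,district_id,village,village_id,hhid
D1,1,v1,1,100
D1,1,v2,2,101
D1,1,v3,3,102
D1,1,v4,4,103
D1,1,v5,5,104
D2,2,v6,6,105
D2,2,v7,7,106
D2,2,v8,8,107
D2,2,v9,9,108
D2,2,v10,10,109
"""
ROWS = list(csv.DictReader(io.StringIO(LISTING_CSV)))


def search(rows, value_col, label_col, match_col=None, match_value=None):
    """Return unique (value, label) choices, optionally filtered on one column."""
    choices = []
    for row in rows:
        if match_col is not None and row[match_col] != str(match_value):
            continue
        choice = (row[value_col], row[label_col])
        if choice not in choices:      # keep each choice once, in file order
            choices.append(choice)
    return choices


districts = search(ROWS, "district_id", "district_name")
villages = search(ROWS, "village_id", "village", "district_id", 2)    # chose D2
households = search(ROWS, "hhid", "hhid", "village_id", 7)            # chose v7
print(districts, villages, households, sep="\n")
```

Each step filters on the id chosen in the previous step, which is why the household search must match on village_id rather than district_id.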
Email: ayush.ap58@gmail.com